(54) System and method for multimodal interactive speech and language training



(57) A system and method for multimodal interactive
speech training include selecting a modality (10)
corresponding to various sensory stimuli to present
non-native vocabulary elements (12) to an individual to train
the individual to immediately respond (16) to a presented
word, situation, or data without performing a time-consuming
literal translation or other complex cognitive
process. The system and method include speech synthesis,
speech recognition, and visual representations
of non-native vocabulary elements to promote rapid
comprehension through neuro-linguistic programming
of the individual.

Description

Technical Field 

The present invention relates to a system and meth- 
od for speech and language training which utilizes mul- 
tiple modalities and multiple sensory interaction to teach 
key vocabulary elements such as phonemes, words, 
and phrases to evoke a reflexive response. 

Background Art 

Theories of human information processing and 
learning are continually emerging and evolving, first in 
the psychological sciences and later in computer sci- 
ence in an effort to model human intelligence in one form 
or another. At least one theory of human information 
processing associates various processing events or 
tasks with a corresponding time which increases by a 
factor of about ten for each level of task. The lowest level 
tasks are performed most rapidly and involve purely 
physical and chemical reactions to produce a response. 
For example, a reflexive act is constrained by the time 
it takes for a stimulus to provoke an action potential in 
a neuron followed by a synaptic transfer to the central 
nervous system which evokes a response in the motor 
system, such as a muscle movement. The individual has
no control over such actions (a knee-jerk reflex reaction,
for example), which occur on the order of about 10 ms.

Deliberate actions are those which use available 
knowledge to choose a particular response over other 
available responses. Such actions require additional 
time to complete, on the order of 100 ms to one second.
Deliberate acts may be further characterized as automatic
acts or controlled acts, with automatic acts requiring
the former time to complete and controlled acts
requiring the latter. Empirical data support this distinc- 
tion and suggest that automatic behavior occurs by 
largely parallel neurological processing while controlled 
behavior occurs by largely sequential neurological 
processing. This is supported by the constraint imposed 
by the underlying biological processes involved in trans- 
mission of neurological signals. 

More demanding cognitive tasks require assem- 
bling a collection of deliberate acts in order to compose 
an appropriate response to a particular stimulus. These 
tasks may require one to ten seconds or more. The time 
required for such tasks is a function of the time required 
to recall appropriate knowledge and apply it to the stim- 
ulus. This time may often be reduced with practice as 
suggested by the power law of practice (the logarithm 
of the response time varies as a function of the logarithm 
of the number of trials). 
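
As an illustration of the power law of practice mentioned above, the following Python sketch assumes the commonly used form T(n) = T(1) * n^(-alpha). The function name `response_time` and the values chosen for the first-trial time and the exponent are hypothetical and are not taken from this disclosure.

```python
import math


def response_time(trial: int, first_trial_time: float, alpha: float) -> float:
    """Power law of practice: T(n) = T(1) * n ** (-alpha).

    Taking logarithms gives log T(n) = log T(1) - alpha * log n, so the
    logarithm of the response time is linear in the logarithm of the
    number of trials.
    """
    if trial < 1:
        raise ValueError("trial numbers start at 1")
    return first_trial_time * trial ** (-alpha)


# Hypothetical values: 8 seconds on the first trial, exponent 0.5.
times = [response_time(n, 8.0, 0.5) for n in (1, 4, 16)]

# Slope of log(time) against log(trials) between trial 1 and trial 16;
# under the assumed model this equals -alpha.
slope = (math.log(times[2]) - math.log(times[0])) / (math.log(16) - math.log(1))
```

Under these assumed values the response time halves each time the number of trials quadruples, which is the kind of reduction with practice that the paragraph above describes.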

Language processing of non-native vocabulary elements
or unfamiliar vocabulary elements in one's native
language is one example of a demanding cognitive
process which requires a processing time on the order
of seconds. This process may involve perception of an
unfamiliar vocabulary element, memory recall to identify
the vocabulary element and associate it with a corresponding
familiar vocabulary element, determination of
an appropriate response, and memory recall to associate
the appropriate response with a corresponding unfamiliar
vocabulary element. In contrast, a stimulus
which triggers a familiar element may evoke a deliberate
automatic response which may be performed in a second
or less depending upon the particular situation,
since the more complex cognitive task of assembling
deliberate acts is not required.

While realization of a global marketplace may eliminate
barriers to travel and trade, fundamental communication
skills are essential, and their absence continues to hinder
progress toward that goal. Business transactions may
not be significantly impacted by language obstacles due
to the availability of translators in the form of bilingual
individuals, computer systems, or pocket dictionaries.
However, a number of individuals are required to perform
time-critical tasks which must transcend the hurdles
imposed by different languages or unfamiliar terms
specific to a particular job or environment, i.e. jargon.
For example, air traffic controllers, pilots, law enforcement
personnel, military personnel, and the like perform
numerous time-critical tasks which demand a correct,
immediate response to verbal, written, or other graphical
communications which may be in languages other
than their native language. These individuals must often
respond immediately to a presented word, situation, or
data without waiting for a literal language translation or
another time-consuming cognitive process.

A number of professions, including those mentioned
above, also include a significant number of vocabulary
elements which are specific to the profession
or the geographical region. Experienced individuals are
capable of assimilating these terms such that the terms
become familiar enough to elicit an immediate accurate
response if necessary. Orientation of new individuals to
unfamiliar vocabulary terms which may be used in these
situations, such as jargon or slang, may require a significant
period of time.

A number of prior art language or vocabulary training
programs present material using multi-sensory
methods but focus on traditional language learning theories
which include semantic, syntactic, and grammatical
memorization. These methods typically require a significant
amount of time to teach fundamental communication
skills, especially for older students. Furthermore,
once the unfamiliar vocabulary and "rules" are memorized,
literal translation requires significant cognitive
processing which slows reaction time.

By excluding speech recognition, prior art training
systems eliminate vital cognitive tasks which may lead
toward increased comprehension and communication
skills. These systems are often rigidly structured and do
not support the dynamics essential to individual learning.

Disclosure Of The Invention

It is thus an object of the present invention to pro- 
vide a system and method for multimodal interactive 
speech and language training which incorporate senso- 
ry preview, instructional comprehension and interactive 
memory recall exercises based upon proven sensory in- 
tegration exercises. 

A further object of the present invention is to provide 
a system and method for vocabulary element training 
which include a flexible structure which supports the dy- 
namics essential to individual learning. 

Another object of the present invention is to provide 
a system and method for teaching unfamiliar vocabulary 
elements which allow individuals having various native 
languages to quickly learn to respond correctly to non- 
native vocabulary elements. 

A still further object of the present invention is to 
provide a system and method for speech and language 
training which use proven sensory-motor interaction ex- 
ercises to increase the learning efficiency in addition to 
improving memory recall and perception skills. 

Yet another object of the present invention is to pro- 
vide a system and method for instantiation of unfamiliar 
vocabulary elements which utilize digitized and com- 
pressed human speech, synthesized speech, and 
speech recognition in addition to visual and tactile inter- 
faces to excite multiple neural paths in receiving and ex- 
pressing vocabulary elements and comprehension 
checks. 

A still further object of the present invention is to 
provide a system and method for speech and language 
training which use sensory-motor interaction exercises 
to promote rapid training in the correct immediate reac- 
tion to the introduction of an unfamiliar vocabulary ele- 
ment whether presented aurally, or visually as a graph- 
ical or textual representation. 

In carrying out the above objects and other objects
and features of the present invention, a system and
method for training an individual to respond to unfamiliar
vocabulary elements include selecting at least one of a
plurality of modalities corresponding to different sensory
stimuli to present unfamiliar vocabulary elements to the
individual and presenting the unfamiliar vocabulary elements
using the selected modality. The system and
method also include pausing for a time sufficient for the
individual to respond to the vocabulary element, receiving
a response from the individual based on the unfamiliar
element, evaluating the response based on a predetermined
desired response criteria, and providing
feedback to the individual using at least one of the
plurality of modalities.

The present invention provides a system and method
for instantiation of correct responses to spoken, written,
or graphically symbolic vocabulary elements such
as phonemes, words, and phrases preferably using
computer-based training designed to produce neuro-linguistic
programming in the student. In a preferred embodiment,
a system or method according to the present
invention utilizes a multi-modal and multi-sensory Preview
Phase followed by a Word Comprehension Phase
and a Memory Recall Phase. The Preview Phase includes
options for completing new exercises in addition
to review exercises which include Element Training,
Graphic Training, and Sight Recognition exercises. The
Word Comprehension Phase includes Sentence Completion,
Graphic Identification, and Word Choice exercises
while the Memory Recall Phase includes Graphic
Memory, Word Recognition, and Story Completion exercises.

In a preferred embodiment, a specially designed,
interactive, PC-based multimedia training system includes
control logic or software to implement a method
according to the present invention.

The advantages accruing to the present invention
are numerous. For example, the present invention allows
any individual, regardless of education or experience,
to rapidly gain the necessary skills to perform
time-critical response tasks for his job. The present
invention provides a system and method which are completely
self-paced and individualized. Speech synthesis
provides an audio stimulus while speech recognition
evaluates the correctness of the student response.
Incorporation of proven sensory integration techniques
increases the learning efficiency while improving memory
recall and perception skills. Positive feedback improves
student motivation by using the student's name in
responses as well as colorful graphics to inspire the student
to speak the unfamiliar vocabulary elements as
accurately as possible.

The above objects and other objects, features, and
advantages of the present invention will be readily
appreciated by one of ordinary skill in this art from the
following detailed description of the best mode for carrying
out the invention when taken in connection with the
accompanying drawings.

Brief Description Of The Drawings

Figure 1 is a flow chart illustrating a system or method
for multimodal speech and language training
according to the present invention;

Figure 2 is a more detailed flow chart/block diagram
illustrating presentation of unfamiliar vocabulary
elements to an individual;

Figure 3 is a block diagram illustrating a system for
multimodal speech and language training
according to the present invention;

Figure 4 illustrates one example of a computer
screen used in one embodiment of a training
system according to the present invention; and

Figure 5 illustrates another example of a computer
screen used in one embodiment of a training
system according to the present invention.

Best Mode(s) For Carrying Out The Invention 

Referring now to Figure 1, a flow chart illustrating a
system and method for multimodal interactive training 
is shown. While the flow chart illustrates a sequential 
process, one of ordinary skill in the art will recognize that 
a system or method according to the present invention 
may be implemented using other techniques without de- 
parting from the spirit or scope of the present invention. 
While a preferred embodiment of the present invention 
is implemented by a programmed computer, the present 
invention transcends the particular hardware platform, 
operating system, programming method, and program- 
ming language, as indicated by the appended claims. 

As illustrated in Figure 1, one or more of a plurality
of modalities is selected to present an unfamiliar vocabulary
element to an individual as represented by block 10.
Regardless of whether an individual has one or more
native languages, the present invention provides a sys- 
tem and method for rapid acquisition of unfamiliar vo- 
cabulary elements whether in a native language or a 
non-native language by implanting the unfamiliar vocab- 
ulary elements into the individual's established neuro- 
linguistic syntax and linking them to a correct mental or 
motor response. 

An unfamiliar vocabulary element is presented to
the individual using one or more modalities selected in
block 10 as represented by block 12 of Figure 1. A vocabulary
element may be a phoneme, word, phrase,
sentence, or paragraph. Preferably, a vocabulary element,
such as a word or phrase, is presented both visually
and aurally to the individual. The visual presentation
may include displaying a word or phrase on a video
screen within a sentence. The aural presentation
preferably includes a synthesized utterance corresponding
to the vocabulary element. This process is
illustrated and explained in greater detail with reference
to Figure 2 below. Also preferably, the presentation is
divided into multiple lessons incorporating new vocabulary
elements with previously presented (review) vocabulary
elements.

The individual is given a period of time to initiate a 
response as indicated by block 14. If no response is in- 
itiated within the time period, the process may continue 
by presenting additional vocabulary elements or may 
provide a prompt using more familiar vocabulary (which 
may be in the user's native language for foreign lan- 
guage applications) to assist the user in understanding 
the unfamiliar vocabulary element. In a preferred embodiment,
vocabulary elements for which an incorrect
response or no response is given are added to a review 
list and presented at a later date or time. 

With continuing reference to Figure 1, the user's response
is received as represented by block 16. The user
may respond using one or more input devices as illustrated
and described in detail with reference to Figure 3.
For example, the user may pronounce the vocabulary
element while pointing to a visual representation of the
vocabulary element. This improves the efficiency of
learning and ability of recall due to multi-sensory integration.
The response (or responses) are evaluated for
correctness as indicated by block 18 and appropriate
feedback is presented to the user based on the correctness
of the response as indicated by block 20.
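
The sequence of blocks 10 through 20 can be sketched in Python as follows. This is only a minimal sketch: the `Trial` class, the `run_trial` function, the callback parameters, and the five-second default pause are hypothetical names and values chosen for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Trial:
    element: str                 # the unfamiliar vocabulary element
    modalities: Tuple[str, ...]  # modalities selected in block 10, e.g. ("visual", "aural")


def run_trial(
    trial: Trial,
    present: Callable[[str, Tuple[str, ...]], None],
    receive: Callable[[float], Optional[str]],
    evaluate: Callable[[str, str], bool],
    give_feedback: Callable[[bool, Tuple[str, ...]], None],
    timeout_s: float = 5.0,  # assumed pause length, not specified in the text
) -> bool:
    """Run one present / pause / receive / evaluate / feedback cycle."""
    present(trial.element, trial.modalities)   # block 12
    response = receive(timeout_s)              # blocks 14 and 16; None means no response
    # Block 18: a missing response is treated as incorrect.
    correct = response is not None and evaluate(trial.element, response)
    give_feedback(correct, trial.modalities)   # block 20
    return correct
```

Passing the presentation, input, evaluation, and feedback steps as callbacks keeps the loop independent of any particular display, speech synthesizer, or speech recognizer.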

In a preferred embodiment, the feedback includes 
both visual and aural feedback. Visual feedback is pro- 
vided by a needle gauge at the bottom of the screen 
which indicates the degree of correct pronunciation. The
aural feedback is coupled with the visual feedback and 
includes a synthesized voice which speaks the user's 
name along with an encouraging response such as 
"Ron, that's close, let's try it again." The user is given 
three opportunities to correctly say the utterance before 
moving on to the next vocabulary element. An incorrect 
response is preferably recorded in a review file to be 
repeated until mastery occurs. Sample screens are il- 
lustrated and described with reference to Figures 4 and 
5. This process may be repeated for a number of unfa- 
miliar vocabulary elements which are preferably 
grouped in lessons having nine new elements. 
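
The three-attempt rule and the review file described above might be sketched as follows. The function names, the `get_score` callback, the 0.0 to 1.0 score scale, the pass threshold of 0.8, and every message other than the quoted "that's close, let's try it again" are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def practice_element(
    element: str,
    user_name: str,
    get_score: Callable[[str, int], float],  # hypothetical pronunciation scorer
    max_attempts: int = 3,
    pass_score: float = 0.8,                 # assumed threshold on the gauge
) -> Tuple[bool, List[str]]:
    """Give the user up to max_attempts tries at one vocabulary element."""
    messages = []
    for attempt in range(1, max_attempts + 1):
        score = get_score(element, attempt)
        if score >= pass_score:
            messages.append(f"{user_name}, that's right!")
            return True, messages
        if attempt < max_attempts:
            messages.append(f"{user_name}, that's close, let's try it again.")
        else:
            messages.append(f"{user_name}, we'll come back to this one later.")
    return False, messages


def run_lesson(
    elements: List[str],
    user_name: str,
    get_score: Callable[[str, int], float],
) -> List[str]:
    """Return the review list: elements not mastered within three attempts."""
    review_list = []
    for element in elements:
        mastered, _ = practice_element(element, user_name, get_score)
        if not mastered:
            review_list.append(element)
    return review_list
```

The returned review list plays the role of the review file: its elements would be presented again in a later session until mastery occurs.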

Referring now to Figure 2, a flow chart/block dia- 
gram illustrates a preferred embodiment of a system or 
method for multimodal interactive training according to 
the present invention. As illustrated, each lesson or ex- 
ercise is divided into three main sections including a 
Multimodal Multisensory Preview 30, Word Comprehen- 
sion 38, and Memory Recall 46. Multisensory preview 
30 teaches and introduces an unfamiliar vocabulary el- 
ement, such as a word, along with a definition and a 
graphical representation, such as a picture. 

Multisensory preview 30 prepares the individual for 
the unfamiliar vocabulary elements to be learned. Re- 
search has shown that a preview of the material greatly 
increases comprehension and retention when instruc- 
tion follows such a preview phase. Preferably, each les- 
son includes nine vocabulary elements, such as words 
or phrases, which may be selected from new vocabulary 
elements or previously presented vocabulary elements. 
Each third lesson is preferably a complete review of pre- 
viously presented vocabulary elements. 
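
One possible reading of this lesson structure is sketched below: new elements are introduced nine at a time, and every third lesson reviews everything presented so far. The name `plan_lessons`, the dictionary layout, and the choice to let a review lesson cover all previously presented elements (rather than a selection of nine) are assumptions, since the text does not fix these details.

```python
from typing import Dict, Iterator, List

LESSON_SIZE = 9  # nine vocabulary elements per lesson, as stated above


def plan_lessons(new_elements: List[str]) -> Iterator[Dict[str, object]]:
    """Yield lesson descriptions; every third lesson is a complete review."""
    seen: List[str] = []
    lesson_no = 0
    pos = 0
    while pos < len(new_elements):
        lesson_no += 1
        if lesson_no % 3 == 0 and seen:
            # Review lesson: no new material, only previously presented elements.
            yield {"lesson": lesson_no, "kind": "review", "elements": list(seen)}
            continue
        batch = new_elements[pos:pos + LESSON_SIZE]
        pos += len(batch)
        seen.extend(batch)
        yield {"lesson": lesson_no, "kind": "new", "elements": batch}
```

With 27 new elements this plan yields two new lessons, a review lesson, and one further new lesson.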

Multisensory preview 30 includes Element Training 
32, Graphic Training 34, and Sight Recognition 36. As 
with Word Comprehension 38 and Memory Recall 46, 
Multisensory Preview 30 includes a sensory input stim- 
ulus, such as a visual and/or auditory presentation, de- 
signed to evoke a motoric response, such as speech or 
a tactile response. Element training 32 allows the user 
to gain greater comprehension by understanding the 
meaningfulness of the presented vocabulary element. 
Preferably, a visual presentation of the vocabulary element
and its context is followed by an auditory presentation
of the same. The user then provides a speech response
by repeating the vocabulary element while viewing
the visual presentation.

Graphic Training 34 is used to promote visualization 
of the unfamiliar vocabulary element and to enhance its 
meaningfulness to the user. Preferably, a visual presen- 
tation of the vocabulary element with an associated 
graphic is followed by an auditory presentation of the 
vocabulary element along with a contextual phrase. The 
user then has the choice of a tactile or speech response 
while viewing the visual presentation. 

Sight Recognition 36 is performed using a tachistoscopic
flash to trigger visual memory as a sensory input.
Preferably a visual flash of the unfamiliar vocabulary el- 
ement evokes a speech response which relies on a pho- 
nological memory link to visual recall in the user. 

After finishing exercises for the Multisensory Pre- 
view 30, Word Comprehension exercises are performed 
as represented by block 38 of Figure 2. Word Compre- 
hension 38 provides practice for learning vocabulary el- 
ement recognition, definition and usage. Word Compre- 
hension 38 includes Sentence Completion 40, Graphic 
Identification 42, and Word Choice 44. Sentence Com- 
pletion 40 includes presentation of a sentence with an 
omitted vocabulary element or elements. The user must 
select the correct answer from a field of nine possibili- 
ties. The user may activate a "hear" icon or button (best 
illustrated in Figures 4 and 5) to hear the sentence pro- 
duced by a synthesized voice. After the first incorrect 
response, a solid line for the missing element changes 
to a correct number of dashes representing the number 
of letters and spaces. After two incorrect responses, the 
first and last letter appear in the blanks. After three in- 
correct responses, about half of the letters appear in the 
blanks. 
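
The progressive hints of Sentence Completion 40 can be illustrated with a short Python sketch. The function name `blank_hint`, the fixed-length solid line, and the rule of revealing every other character to approximate "about half of the letters" are assumptions; the text does not say which letters are revealed.

```python
def blank_hint(answer: str, incorrect: int) -> str:
    """Return the text shown in the blank after a number of incorrect responses."""
    if incorrect <= 0:
        # Solid line whose length gives no clue about the answer.
        return "________"
    # After the first miss: one dash per letter or space of the answer.
    chars = ["-"] * len(answer)
    if incorrect >= 2 and answer:
        # After two misses: the first and last letters appear.
        chars[0], chars[-1] = answer[0], answer[-1]
    if incorrect >= 3:
        # After three misses: reveal every other character (about half).
        for i in range(0, len(answer), 2):
            chars[i] = answer[i]
    return "".join(chars)


# Hints for a hypothetical answer "runway" after 0, 1, 2, and 3 misses.
hints = [blank_hint("runway", n) for n in range(4)]
```

For the hypothetical answer "runway", the hints progress from a solid line to "------", then "r----y", then "r-n-ay".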

Graphic Identification 42 includes visual presenta- 
tion of three graphics representing three unfamiliar vo- 
cabulary elements. An aural presentation of one of the 
elements prompts the user to respond. The aural pres- 
entation may be of the element itself or its definition, as 
selected by the user or randomly selected by the sys- 
tem. The user then selects the correct graphic using an 
input device (best illustrated in Figure 3). Regardless of 
the response, three new graphics are then visually pre- 
sented in random order. This cycle continues until a cor- 
rect response has been entered for all of the nine graph- 
ics. 
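
The Graphic Identification cycle can be simulated with the sketch below. The function name, the `choose` callback standing in for the user, the rule that the prompted element is drawn from those not yet answered correctly, and the `max_rounds` safety limit are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional


def graphic_identification(
    elements: List[str],
    choose: Callable[[List[str], str], Optional[str]],  # stands in for the user
    rng: Optional[random.Random] = None,
    max_rounds: int = 1000,  # safety limit for simulations
) -> int:
    """Show three graphics per round until every element is answered correctly.

    Returns the number of rounds played.
    """
    if rng is None:
        rng = random.Random()
    remaining = set(elements)
    rounds = 0
    while remaining and rounds < max_rounds:
        rounds += 1
        prompt = rng.choice(sorted(remaining))  # element presented aurally
        distractors = rng.sample([e for e in elements if e != prompt], 2)
        shown = [prompt] + distractors
        rng.shuffle(shown)                      # random on-screen order
        if choose(shown, prompt) == prompt:
            remaining.discard(prompt)
        # Regardless of the response, the next round shows three new graphics.
    return rounds
```

A user who always answers correctly finishes a nine-element lesson in nine rounds; each incorrect response adds a round because the missed element stays in the pool.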

Word Choice 44 visually presents all nine vocabu- 
lary elements of the current lesson. An aural presenta- 
tion of the unfamiliar vocabulary element or its definition 
(selectable option) prompts the user to select the corre- 
sponding element with an input device. Again, regard- 
less of response, a new element (or its definition) is pre- 
sented. Correct responses remain on display while ad- 
ditional elements or definitions are presented until a cor- 
rect response is provided for each element. 

Memory Recall 46 of Figure 2 includes additional
exercises designed to improve memory retention and
recall after completion of Word Comprehension 38.
Memory Recall 46 includes exercises for Graphic
Memory 48, Word Recognition 50, and Story Completion 52.
Graphic Memory 48 includes visual presentation of a
single graphic or word. The user responds by pronouncing
the appropriate word which is analyzed by a speech
recognizer. Graphics or words are randomly presented
until all are correctly identified. Word Recognition 50 includes
visual presentation of three words or graphics accompanied
by aural presentation of one of the three.
The user must select the correct match with an input
device (best illustrated in Figure 3). Graphics or words
are randomly presented until a correct response is provided
for each of the vocabulary elements.

Story Completion 52 includes visual presentation of
a paragraph with four or five vocabulary elements absent.
The user can select an aural presentation by activating
a corresponding icon or button (best illustrated in
Figures 4 and 5). A number of response modes are
available to the user. The user may select a vocabulary
element from a list using an input device for each of the
omitted elements. When the user has finished responding,
all responses are evaluated.
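
Because Story Completion responses are evaluated only after the user has finished, the evaluation step can be sketched as a single batch comparison. The function name `evaluate_story`, the case-insensitive comparison, and the returned dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List


def evaluate_story(answers: List[str], responses: List[str]) -> Dict[str, list]:
    """Compare the user's response for each blank with the expected element."""
    if len(responses) != len(answers):
        raise ValueError("every blank needs a response before evaluation")
    results = [
        given.strip().lower() == expected.lower()
        for expected, given in zip(answers, responses)
    ]
    # Elements answered incorrectly could be added to a review list.
    missed = [expected for expected, ok in zip(answers, results) if not ok]
    return {"results": results, "missed": missed}
```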

Referring now to Figure 3, a block diagram illustrating
a system for multimodal interactive training according
to the present invention is shown. The system includes
a processor 60 which includes control logic for
implementing a method according to the present invention
as described with reference to Figure 1 and Figure 2.
Processor 60 is in communication with input devices
such as digitizing tablet 62, keyboard 66, microphone
72, and wireless mouse 78. A video display 64 is used
to present text and graphics to the individual and is also
in communication with processor 60. Of course, various
other input devices could be used to allow a tactile response
from the user without departing from the spirit
or scope of the present invention.

With continuing reference to Figure 3, a digitized vocabulary
library 68 may be stored within processor 60
or in an external non-volatile storage medium in communication
with processor 60. Speech recognition device
70 utilizes microphone 72 to capture and analyze audio
input from the user. Similarly, speech synthesizer 74 and
speaker(s) 76 are used to aurally present vocabulary elements
to the user. Of course, various components of
the system illustrated may be implemented in software,
hardware, or a combination of software and hardware
without departing from the spirit or scope of the present
invention as such implementations are merely a matter
of design choice. For example, speech recognition device
70 may include a dedicated digital signal processor
with associated software to convert audio information
acquired by microphone 72 into a data stream which is
communicated to processor 60. Speech recognition device
70 and speech synthesizer 74 may be located in
the same chassis as processor 60, sharing a common
backplane, or may be remotely located near the user
and communicate via a network.

In operation, processor 60 includes appropriate
control logic preferably in the form of software to present 
visual stimuli, such as text and graphics, on display 64. 
Aural presentation of vocabulary elements utilizes 
speech synthesizer 74 and speaker(s) 76. The user may 
respond using one or more input devices, such as key- 
board 66, mouse 78, tablet 62, or microphone 72. Vo- 
cabulary library 68 includes recorded digitized repre- 
sentations of vocabulary elements which may be used 
by speech recognition device 70 to evaluate the correct- 
ness of a response or by speech synthesizer 74 to form 
an audio representation of the vocabulary elements. 

Figures 4 and 5 illustrate representative screen dis- 
plays which may be presented to the user via display 
64. For ease of illustration, icons are generally repre- 
sented by blank boxes. Of course, the present invention 
is independent of the particular icons selected or the 
particular screen layout illustrated. Rather, the screen 
representations of Figures 4 and 5 illustrate the multimodal
approach to speech and language training of the
present invention. Primed reference numerals of Figure 
5 indicate items of similar structure and function illus- 
trated and described with reference to Figure 4. 

As shown in Figure 4, a window 80 contains a 
number of graphical representations or icons 82, 84, 86, 
88, 90, 98, and 100. In addition, window 80 includes an 
area for presenting a particular vocabulary element, in- 
dicated generally by reference numeral 96, a graphical 
presentation area 92, and a contextual area 94. Window 
80 also includes control icons 98 and 100 which allow 
manipulation of the display window 80. Similarly, menu 
items 102 allow the user to select various options related
to the training lessons. 

In a preferred embodiment, icon 82 is a microphone 
icon which is activated when the user is asked to provide 
a verbal response. Icon 84 provides visual feedback in 
the form of a confidence meter which indicates the cor- 
rectness of a user response. Activation of icon 86 allows 
the user to repeat presentation of a particular vocabu- 
lary element. 

Activation of icon 88 changes contextual presenta- 
tion from one or more unfamiliar vocabulary elements 
to those which are more familiar. For example, in a for- 
eign language training application, icon 88 may be a flag 
which represents the user's native language. If the user 
is having difficulty comprehending a particular vocabu- 
lary element or graphical representation, activation of 
icon 88 presents a description in the user's native lan- 
guage. Other icons, such as icon 90, may be used to 
indicate progress or level of difficulty of the particular 
exercise. As illustrated in Figures 4 and 5, graphical 
presentation area 92 may be replaced by contextual in- 
formation 110 depending upon the particular lesson or 
exercise being performed. 

It is understood, of course, that while the forms of
the invention herein shown and described include the
best mode contemplated for carrying out the present invention,
they are not intended to illustrate all possible
forms thereof. It will also be understood that the words
used are descriptive rather than limiting, and that various
changes may be made without departing from the
spirit or scope of the invention as claimed below.

Claims 

1. A method for training an individual to recognize and
respond to at least one unfamiliar vocabulary element,
the method comprising:

selecting (10) at least one of a plurality of modalities
corresponding to different sensory stimuli
to present the at least one unfamiliar vocabulary
element to the individual;

presenting (12) the at least one unfamiliar vocabulary
element using the selected modality;

pausing (14) for a time sufficient for the individual
to respond to the at least one unfamiliar vocabulary
element;

receiving (16) a response from the individual
based on the at least one unfamiliar vocabulary
element;

evaluating (18) the response based on a predetermined
desired response criteria; and

providing feedback (20) to the individual using
at least one of the plurality of modalities.

2. The method of claim 1 wherein selecting comprises
selecting at least one modality from the group consisting
of audio, textual, and graphical representations
of the at least one unfamiliar vocabulary element.

3. The method of claims 1 or 2 wherein providing feedback
comprises generating a speech signal which
includes a name of the individual.

4. The method of claims 1, 2, or 3 wherein presenting
comprises presenting an audio stimulus including a
synthesized utterance and a visual representation
of the at least one unfamiliar vocabulary element.

5. The method of claims 1, 2, 3, or 4 wherein receiving
comprises:

receiving a request from the individual indicative
of difficulty in understanding the at least
one unfamiliar vocabulary element; and

presenting the at least one unfamiliar vocabulary
element in a context of native vocabulary
elements.

6. The method of claims 1, 2, 3, 4, or 5 wherein providing
feedback comprises providing at least one
visual and at least one audio indication of correctness
of the response.

7. The method of claim 6 wherein the step of receiving
includes receiving a verbal response from the individual.

8. A system for training an individual to recognize and
respond to at least one unfamiliar vocabulary element,
the system comprising:

a display device (64) for providing a visual representation
of the at least one unfamiliar vocabulary
element;

a speech synthesizer (74) for providing an audio
representation of the at least one unfamiliar
vocabulary element;

at least one input device (62, 66, 70, 78) for generating
a signal indicative of a response of the
individual to the at least one unfamiliar vocabulary
element; and

control logic (60) in communication with the display
device (64), the speech synthesizer (74)
and the at least one input device (62, 66, 70, 78)
for selecting at least one of the display device
(64) and the speech synthesizer (74) to present
the at least one unfamiliar vocabulary element
to the individual, presenting the at least one unfamiliar
vocabulary element using the selected
device, pausing for a time sufficient for the individual
to generate a response to the at least
one unfamiliar vocabulary element, evaluating
the response based on a desired response criteria,
and providing feedback to the individual
using at least one of the display device (64) and
the speech synthesizer (74).

9. The system of claim 8 wherein the at least one input
device comprises a microphone (72) and wherein
the control logic implements speech recognition to
evaluate the response based on the desired response
criteria.

10. The system of claim 8 wherein the control logic (60)
communicates with the speech synthesizer (74)
and the display device (64) to present an audio stimulus
to the individual including a synthesized utterance
and a visual representation of the at least one
unfamiliar vocabulary element.

FIG. 1.
FIG. 2.
FIG. 3.